This final report describes the progress and accomplishments made on the project
entitled,  “Learning  Integrated  Visual  Databases  for  Image  Exploitation,”  during  the 
period  from  March  15, 1997  to  August  31, 2002. 

1.  OBJECTIVES: 

The  DoD  has  critical  needs  for  robust  high  performance  automated  systems  that  can 
recognize  objects  in  reconnaissance  imagery  acquired  under  dynamically  changing 
conditions  and  for  systems  that  can  efficiently  extract  information  from  enormous  image 
databases. Our research addresses two interrelated problems with the effectiveness and
efficiency of automated/semi-automated techniques for image understanding. The first is the
lack of robustness in algorithms and systems for object recognition under changing
environments and extended operating conditions. The second is the lack of scalable intelligent
strategies for quickly extracting meaningful information from enormous, dynamically
changing image databases. Our research is aimed at developing image understanding
(IU)  algorithms  and  systems  that  have  performance  prediction  and  learning  capabilities 
and  that  can  improve  their  performance  with  experience,  in  terms  of  quality  of  results, 
processing  speed  and  matching  with  the  user's  perception. 

The  specific  subgoals  explored  are: 

Fundamental  theory  for  predicting  the  performance  of  object  recognition  systems  and  its 
validation  on  SAR  images, 

Automatic  methods  for  recognizing  articulated,  occluded  and  configuration  variants  of 
targets  in  SAR  images  and  video, 

Adaptive  learning  integrated  target  recognition  algorithms/systems,  and 

Learning visual concepts in images/videos with user interaction and experience over time.

We  have  developed  promising  fundamental  approaches  and  obtained  excellent  results  to 
solve  some  of  the  crucial  problems  in  image  understanding  and  image  databases  that  will 
have  strong  impact  in  solving  real-world  DoD  applications.  In  the  following  we  describe 
the  major  accomplishments  achieved  during  the  reporting  period.  Specific  aspects  of  the 
research  are  given  in  greater  detail  in  separate  papers  published  in  journals  and  major 
conferences.  A  list  of  all  the  published  papers  is  also  presented. 

2. RESEARCH ACCOMPLISHMENTS

2.1  Recognition  Performance  Prediction  and  Fundamental  Performance  Bounds 

We  have  developed  fundamental  techniques  for  predicting  the  performance  of  model- 
based  object  recognition  systems  in  the  presence  of  data  uncertainty,  occlusion  and 
clutter.  These  techniques  determine  fundamental  performance  bounds  (lower  and  upper) 
and  set  the  limits  on  what  is  possible  for  a  feature-based  object  recognition  system.  The 
new  techniques  capture  the  structural  similarity  between  model  objects,  which  is  a 
fundamental factor in determining the recognition performance. We have validated
the theory by comparing predicted probability of correct recognition (PCR) plots with
those obtained experimentally using MSTAR SAR data collected by AFRL for
target recognition research under extended operating conditions.

In  addition,  we  have  developed  SAR  ATR  algorithms  that  explicitly  account  for  model 
similarity.  In  this  work  we  optimize  recognition  models  for  SAR  signatures  of  vehicles  to 
improve  the  performance  of  a  recognition  algorithm  under  the  extended  operating 
conditions  of  target  articulation,  occlusion  and  configuration  variants.  The  recognition 
models  are  based  on  quasi-invariant  local  features,  scattering  center  locations  and 
magnitudes.  The  approach  determines  the  similarities  and  differences  among  the  various 
vehicle  models.  Methods  to  penalize  similar  features  or  reward  dissimilar  features  are 
used  to  increase  the  distinguishability  of  the  recognition  model  instances.  Extensive 
experimental  results,  in  terms  of  confusion  matrices  and  ROC  curves,  demonstrate  the 
improvements  in  recognition  performance  for  real  SAR  signatures  of  vehicle  targets  with 
articulation,  configuration  variants  and  occlusion. 

Journal  Publications: 

1.  M.  Boshra  and  B.  Bhanu,  “Predicting  performance  of  object  recognition,”  IEEE 
Transactions  on  Pattern  Analysis  and  Machine  Intelligence,  Vol.  22,  No.  9,  pp.  956-969, 
2000. 

2.  M.  Boshra  and  B.  Bhanu,  “Predicting  an  upper  bound  on  SAR  ATR  performance,” 
IEEE  Transactions  on  Aerospace  and  Electronic  Systems,  Vol.  37,  No.  3,  pp.  876-888, 
2001. 

3. B. Bhanu and G. Jones, “Increasing the discrimination of SAR recognition models,”
Optical  Engineering,  Vol.  41,  No.  12,  December  2002. 

2.2  Automatic  Target  Recognition  in  Extended  Operating  Conditions  — 
Articulated,  Occluded  and  Configuration  Variants  of  Targets 

We  have  developed  a  model-based  SAR  recognition  system  that  uses  standard  non- 
articulated  models  of  objects  to  recognize  the  same  objects  in  non-standard,  occluded  and 
articulated configurations. The system is based on the quasi-invariance of radar scatterer
locations and magnitudes, and a recognition approach that accumulates evidence from
local features successfully handles articulation, occlusion and configuration
variants.
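
The evidence-accumulation idea can be illustrated with a minimal sketch. It assumes that each model is simply a list of integer scatterer locations and that the only unknown is a translation; the model names, coordinates and voting rule are hypothetical and are not the system's actual features or matching procedure. Each pairing of a model scatterer with a test scatterer votes for the translation that would align them, so a partially occluded target can still collect a strong peak from its visible scatterers.

```python
from collections import Counter

def best_translation_votes(model_pts, test_pts):
    """Largest number of model/test scatterer pairs that agree on one translation."""
    votes = Counter()
    for mx, my in model_pts:
        for tx, ty in test_pts:
            # Each pairing votes for the translation that would align it.
            votes[(tx - mx, ty - my)] += 1
    return max(votes.values()) if votes else 0

def recognize(models, test_pts):
    """Pick the model whose scatterers accumulate the most consistent votes."""
    scores = {name: best_translation_votes(pts, test_pts)
              for name, pts in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical scatterer locations (range, cross-range) for two vehicle models.
models = {
    "vehicle_a": [(0, 0), (2, 1), (4, 0), (5, 3), (1, 4)],
    "vehicle_b": [(0, 0), (1, 2), (3, 3), (6, 1), (2, 5)],
}
# vehicle_a shifted by (10, 20), two scatterers occluded, one spurious return added.
test_pts = [(10, 20), (12, 21), (14, 20), (17, 29)]

label, scores = recognize(models, test_pts)
print(label, scores)
```

Because only the peak of the vote table matters, missing scatterers lower the score without preventing a match, which is the property that makes voting attractive under occlusion.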

The  independent  views  in  SAR  are  an  opportunity  for  increased  recognition  performance. 
The  focus  of  this  research  has  been  to  optimize  the  recognition  of  vehicles  using  multiple 
SAR  recognizers.  Both  recognition  from  multiple  look  angles  and  multiple  recognizers 
with  different  parameter  tunings  are  investigated.  Extensive  experimental  recognition 
results,  in  terms  of  receiver  operating  characteristic  (ROC)  curves,  show  the  effects  on 
recognition  performance  for  MSTAR  vehicle  targets  with  articulation,  configuration 
variants  and  occlusion. 

Journal  Publications 

1.  G.  Jones  and  B.  Bhanu,  “Recognition  of  articulated  and  occluded  objects,”  IEEE 
Transactions  on  Pattern  Analysis  and  Machine  Intelligence,  Vol.  21,  No.  7,  pp.  603-613, 
1999. 

2.  B.  Bhanu  and  G.  Jones,  “Recognizing  target  variants  and  articulations  in  synthetic 
aperture  radar  images,”  Optical  Engineering,  Vol.  39,  No.  3,  pp.  712-723,  2000. 

3.  G.  Jones  and  B.  Bhanu,  “Recognizing  occluded  objects  in  SAR  images,”  IEEE 
Transactions on Aerospace and Electronic Systems, Vol. 37, No. 1, pp. 316-328, 2001.

4. G. Jones and B. Bhanu, “Recognizing articulated targets in SAR images,” Pattern
Recognition,  Vol.  34,  No.  2,  pp.  469-485,  2001. 

5.  J.S.  Ahn  and  B.  Bhanu,  “Model-based  recognition  of  articulated  objects,”  Pattern 
Recognition  Letters,  Vol.  23,  No.  8,  pp.  1019-1029,  2002. 

6. B. Bhanu and G. Jones III, “Multiple look angle SAR recognition,” International
Journal  of  Imaging  and  Graphics.  Accepted. 

2.2.1  Stochastic  Models  for  Recognition 

Recognition  of  occluded  objects  in  synthetic  aperture  radar  (SAR)  images  is  a  significant 
problem  for  automatic  target  recognition.  Stochastic  models  provide  some  attractive 
features  for  pattern  matching  and  recognition  under  partial  occlusion  and  noise.  We 
develop a hidden Markov modeling (HMM) based approach for recognizing objects in
SAR images. We identify the peculiar characteristics of SAR
sensors and, using these characteristics, develop multiple feature-based models for a
given SAR image of an object. The models, which exploit the relative geometry of feature
locations or the amplitude of the SAR radar return, are based on sequentialization of
scattering centers extracted from SAR images. In order to improve performance we
integrate  these  models  synergistically  using  their  probabilistic  estimates  for  recognition 
of  a  particular  target  at  a  specific  azimuth.  Experimental  results  are  presented  using  both 
synthetic  and  real  SAR  images. 
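
As an illustration of how probabilistic estimates from several HMMs might be combined, the following sketch scores a sequence of quantized scattering-center observations with the standard forward algorithm and sums the log-likelihoods of a geometry model and an amplitude model for each target. The two-state HMM parameters, the symbol alphabet and the independence assumption behind the summation are all hypothetical choices made for this example; they are not taken from the reported system.

```python
import math

def forward_likelihood(obs, start, trans, emit):
    """P(obs | HMM) for a discrete HMM, computed with the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

def classify(observations, targets):
    """Sum log-likelihoods over the models of each target (assumes independence)."""
    scores = {}
    for name, models in targets.items():
        scores[name] = sum(
            math.log(forward_likelihood(observations[kind], *hmm))
            for kind, hmm in models.items())
    return max(scores, key=scores.get), scores

# Hypothetical left-to-right HMMs: (start, transition, emission) over symbols {0, 1}.
trans = [[0.5, 0.5], [0.0, 1.0]]
hmm_a = ([1.0, 0.0], trans, [[0.9, 0.1], [0.2, 0.8]])
hmm_b = ([1.0, 0.0], trans, [[0.1, 0.9], [0.8, 0.2]])
targets = {
    "target_a": {"geometry": hmm_a, "amplitude": hmm_a},
    "target_b": {"geometry": hmm_b, "amplitude": hmm_b},
}
# Quantized observation sequences from two sequentializations of one image.
observations = {"geometry": [0, 0, 1, 1], "amplitude": [0, 1, 1, 1]}

best, scores = classify(observations, targets)
print(best)
```

Summing log-likelihoods is only one way to integrate the models; a weighted combination would fit the same interface if some models prove more reliable than others.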

Journal  Publication 

1.  B.  Bhanu  and  Y.  Lin,  “Stochastic  models  for  recognition  of  occluded  objects,”  Pattern 
Recognition,  Revised  September  2002. 

2.2.2  Recognition  of  Human  Articulated  Motion 

Current  gait  recognition  approaches  only  consider  individuals  walking  frontoparallel  to 
the  image  plane.  This  makes  them  inapplicable  for  recognizing  individuals  walking  from 
different angles with respect to the image plane. In this research, we develop a kinematic-based
approach to recognize individuals by gait. The new approach estimates 3D human
walking parameters by performing a least squares fit of the 3D kinematic model to the 2D
silhouette extracted from a monocular image sequence. A genetic algorithm is used for
feature  selection  from  the  estimated  parameters,  and  the  individuals  are  then  recognized 
from  the  feature  vectors  using  a  nearest  neighbor  method.  Experimental  results  show  that 
the  approach  achieves  good  performance  in  recognizing  individuals  walking  from 
different  angles  with  respect  to  the  image  plane. 

Reviewed  Conference  Publications 

1.  B.  Bhanu  and  J.  Han,  “Individual  recognition  by  kinematic-based  gait  analysis,” 
Proceedings International Conference on Pattern Recognition, Vol. III, pp. 343-346,
2002. 

2.2.3  Moving  Shadow  Detection 

Moving  object  detection  systems  generally  detect  shadows  cast  by  the  moving  object  as 
part of the moving object. In this work we address the problem of separating moving cast
shadows from moving objects in outdoor environments. Unlike previous
work, we provide a method that does not use any geometrical information. Our physics-based
approach relies on a new spatio-temporal albedo normalization test and a
dichromatic reflection model. The physics-based model is used in both the estimation and
verification phases. We obtain results for several different video sequences representing a
variety  of  materials  and  shadows.  We  achieve  excellent  results  in  distinguishing  moving 
objects  from  their  shadows.  The  results  indicate  that  our  approach  is  robust  to  a  variety  of 
background  and  foreground  materials  and  varying  illumination  conditions. 

Reviewed  Conference  Publications 

1. S. Sohail and B. Bhanu, “Moving shadow detection using a physics-based approach,”
Proceedings  International  Conference  on  Pattern  Recognition,  Vol.  II,  pp.  701-704, 
2002. 

2.3 Adaptation and Learning for Target Recognition

We  have  developed  several  techniques  based  on  reinforcement  learning  for  closed-loop 
target  recognition. 

1. Reinforcement Learning with a Team of Learning Automata: Current computer vision
systems  whose  basic  methodology  is  open  loop  or  filter  type  typically  use  image 
segmentation  followed  by  object  recognition  algorithms.  These  systems  are  not  robust 
for  most  real-world  applications.  In  contrast,  the  system  presented  here  achieves  robust 
performance  by  using  reinforcement  learning  to  induce  a  mapping  from  input  images  to 
corresponding  segmentation  parameters.  This  is  accomplished  by  using  the  confidence 
level  of  model  matching  as  a  reinforcement  signal  for  a  team  of  learning  automata  to 
search  for  segmentation  parameters  during  training.  The  use  of  the  recognition  algorithm 
as  part  of  the  evaluation  function  for  image  segmentation  gives  rise  to  significant 
improvement  of  the  system  performance  by  automatic  generation  of  recognition 
strategies.  The  system  is  verified  through  experiments  on  sequences  of  indoor  and 
outdoor  color  images  with  varying  external  conditions. 

2.  Delayed  Reinforcement  Learning:  Object  recognition  is  a  multi-level  process 
requiring  a  sequence  of  algorithms  at  low,  intermediate  and  high  levels.  Generally,  such 
systems  are  open  loop  with  no  feedback  between  levels  and  assuring  their  robustness  is  a 
key  challenge  in  computer  vision  and  pattern  recognition  research.  A  robust  closed-loop 
system  based  on  “delayed”  reinforcement  learning  is  introduced.  The  parameters  of  a 
multi-level  system  employed  for  model-based  object  recognition  are  learned.  The 
method  improves  recognition  results  over  time  by  using  the  output  at  the  highest  level  as 
feedback for the learning system. The method has been validated experimentally by learning
the parameters of image segmentation and feature extraction and thereby recognizing 2-D objects.
The  approach  systematically  controls  feedback  in  a  multi-level  vision  system  and  shows 
promise  in  approaching  a  long-standing  problem  in  the  field  of  computer  vision  and 
pattern  recognition. 

3.  Use  of  Domain  Knowledge  in  Reinforcement  Learning:  We  have  developed  a 
general  approach  to  image  segmentation  and  object  recognition  that  can  adapt  the  image 
segmentation algorithm parameters to the changing environmental conditions.
Segmentation  parameters  are  represented  using  a  team  of  generalized  stochastic  learning 
automata  and  learned  using  connectionist  reinforcement  learning  techniques.  The  edge- 
border  coincidence  measure  is  first  used  as  reinforcement  for  segmentation  evaluation  to 
reduce  computational  expenses  associated  with  model  matching  during  the  early  stage  of 
adaptation.  This  measure  alone,  however,  cannot  reliably  predict  the  outcome  of  object 
recognition.  Therefore,  it  is  used  in  conjunction  with  model  matching  where  the 
matching  confidence  is  used  as  a  reinforcement  signal  to  provide  optimal  segmentation 
evaluation  in  a  closed-loop  object  recognition  system.  The  adaptation  alternates  between 
global  and  local  segmentation  processes  in  order  to  achieve  optimal  recognition 
performance.  Results  are  presented  for  both  indoor  and  outdoor  color  images  where  the 
performance  improvement  over  time  is  shown  for  both  image  segmentation  and  object 
recognition. 

4.  Adaptive  SAR  ATR  system  based  on  Reinforcement  Learning:  Target  recognition 
is  a  multi-level  process  requiring  a  sequence  of  algorithms  at  low,  intermediate  and  high 
levels.  Generally,  such  systems  are  open  loop  with  no  feedback  between  levels  and 
assuring  their  performance  at  the  given  Probability  of  Correct  Identification  (PCI)  and 
Probability  of  False  Alarm  (Pf)  is  a  key  challenge  in  computer  vision  and  pattern 
recognition  research.  We  have  developed  a  robust  closed-loop  system  for  recognition  of 
SAR  images  based  on  reinforcement  learning.  The  parameters  in  model-based  SAR 
target  recognition  are  learned.  The  method  meets  performance  specifications  by  using 
PCI and Pf as feedback for the learning system. The approach has been validated
experimentally by learning the parameters of the recognition system for SAR imagery,
successfully recognizing articulated targets, targets of different configurations and targets
at different depression angles.

5.  Adaptive  Recognition  for  Autonomous  Navigation:  Current  machine  perception 
techniques  that  typically  use  segmentation  followed  by  object  recognition  lack  the 
required  robustness  to  cope  with  the  large  variety  of  situations  encountered  in  real-world 
navigation.  Many  existing  techniques  are  brittle  in  the  sense  that  even  minor  changes  in 
the  expected  task  environment  (e.g.,  different  lighting  conditions,  geometrical  distortion, 
etc.) can severely degrade the performance of the system or even make it fail completely.
We  have  developed  a  system  that  achieves  robust  performance  by  using  local 
reinforcement  learning  to  induce  a  highly  adaptive  mapping  from  input  images  to 
segmentation  strategies  for  successful  recognition.  This  is  accomplished  by  using  the 
confidence level of model matching as reinforcement to drive learning. Local
reinforcement learning gives rise to greater improvement in recognition performance.
The  system  is  verified  through  experiments  on  a  large  set  of  real  images  of  traffic  signs. 
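
The closed-loop idea shared by the techniques above can be sketched as follows. This is a minimal illustration only: it assumes one automaton per segmentation parameter, uses a linear reward-inaction update (a textbook learning-automaton rule chosen here for brevity, not necessarily the update used in the published systems), and replaces segmentation plus model matching with a hypothetical `matching_confidence` function. The parameter names and candidate values are likewise invented.

```python
import random

class LearningAutomaton:
    """Keeps a probability for each candidate value of one segmentation parameter."""

    def __init__(self, actions, rate=0.05):
        self.actions = list(actions)
        self.rate = rate
        self.probs = [1.0 / len(self.actions)] * len(self.actions)

    def choose(self, rng):
        return rng.choices(range(len(self.actions)), weights=self.probs)[0]

    def reward(self, chosen, r):
        """Linear reward-inaction update with reinforcement r in [0, 1]."""
        step = self.rate * r
        self.probs = [p + step * (1.0 - p) if i == chosen else p * (1.0 - step)
                      for i, p in enumerate(self.probs)]

def matching_confidence(params):
    # Hypothetical stand-in for running segmentation and model matching;
    # it peaks at threshold=0.6, min_region=50.
    threshold, min_region = params
    return max(0.0, 1.0 - 2.0 * abs(threshold - 0.6) - abs(min_region - 50) / 100.0)

def train(team, episodes=3000, seed=0):
    rng = random.Random(seed)
    for _ in range(episodes):
        picks = [a.choose(rng) for a in team]
        params = tuple(a.actions[i] for a, i in zip(team, picks))
        r = matching_confidence(params)   # the same signal is broadcast to the team
        for a, i in zip(team, picks):
            a.reward(i, r)
    # Report the most probable value of each parameter.
    return tuple(a.actions[max(range(len(a.actions)), key=a.probs.__getitem__)]
                 for a in team)

team = [LearningAutomaton([0.2, 0.4, 0.6, 0.8]),   # hypothetical threshold values
        LearningAutomaton([10, 50, 90])]           # hypothetical minimum region sizes
learned = train(team)
print(learned)
```

The point of the design is that the recognition stage, not a segmentation-only quality measure, supplies the reinforcement, so the automata are rewarded only for parameter choices that help recognition.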

Journal  Publications 

1. J. Peng and B. Bhanu, “Closed loop object recognition using reinforcement learning,”
IEEE  Transactions  on  Pattern  Analysis  and  Machine  Intelligence,  Vol.  20,  No.  2,  pp. 
139-154,  1998. 

2.  J.  Peng  and  B.  Bhanu,  “Delayed  reinforcement  learning  for  adaptive  image 
segmentation  and  feature  extraction,”  IEEE  Transactions  on  Systems,  Man  and 
Cybernetics,  Vol.  28,  No.  3,  pp.  482-488, 1998. 

3.  B.  Bhanu,  Y.  Lin,  G.  Jones  and  J.  Peng,  “Adaptive  target  recognition,”  International 
Journal of Machine Vision and Applications, Vol. 11, No. 6, pp. 289-299, 2000.

4.  J.  Peng  and  B.  Bhanu,  “Learning  to  perceive  objects  for  autonomous  navigation,” 
Autonomous  Robots,  Vol.  6,  No.  2,  pp.  187-201, 1999. 

5.  B.  Bhanu  and  J.  Peng,  “Adaptive  integrated  image  segmentation  and  object 
recognition,”  IEEE  Transactions  on  Systems,  Man  and  Cybernetics-  Part  C,  Vol.  30,  No. 
4, pp. 427-441, 2000.

6. J. Peng and B. Bhanu, “Local discriminative learning for pattern recognition,” Pattern
Recognition,  Vol.  34,  No.  1,  pp.  139-150, 2001. 

2.4 Learning Concepts in Images and Videos

We  have  developed  two  major  ideas  and  approaches  that  relate  low-level  image  features 
with  high-level  visual  concepts  in  images  to  accomplish  image  based  queries  in  large 
databases.  The  challenge  has  been  to  overcome  the  subjective  nature  of  human  image 
interpretation. 

Key Idea 1: Probabilistic Feature Relevance Learning for Content-Based Image
Retrieval: Most current image retrieval systems use “one-shot” queries to a
database to retrieve similar images. Typically a K-nearest neighbor type of algorithm is
used,  where  weights  measuring  feature  importance  along  each  input  dimension  remain 
fixed  (or  manually  tweaked  by  the  user),  in  the  computation  of  a  given  similarity  metric. 
However,  the  similarity  does  not  vary  with  equal  strength  or  in  the  same  proportion  in  all 
directions  in  the  feature  space  emanating  from  the  query  image.  The  manual  adjustment 
of  these  weights  is  time  consuming  and  it  requires  a  very  sophisticated  user.  We  have 
developed  a  novel  probabilistic  method  that  enables  image  retrieval  procedures  to 
automatically capture feature relevance based on the user’s feedback and that is highly
adaptive  to  query  locations.  This  feedback,  in  the  form  of  accept  or  reject  examples 
generated  in  response  to  a  query  image,  is  used  to  locally  estimate  the  strength  of  features 
along  each  dimension  while  taking  into  consideration  the  correlation  between  features. 
This  results  in  local  neighborhoods  that  are  constricted  along  feature  dimensions  that  are 
most  relevant,  while  elongated  along  less  relevant  ones.  In  addition  to  exploring  and 
exploiting  local  principal  information,  the  system  seeks  a  global  space  for  efficient 
independent  feature  analysis  by  combining  such  local  information.  We  provide 
experimental  results  that  demonstrate  the  efficacy  of  our  technique  using  both  simulated 
and  real-world  data. 
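
A simplified sketch of feedback-driven feature weighting is given below. It assumes that relevance along a feature is estimated as the fraction of accepted examples among the feedback examples lying within a window `omega` of the query on that feature, and that relevances are turned into weights with an exponential scheme controlled by a `sharpness` factor. Both parameters, the toy data and the omission of feature correlation are assumptions made for illustration.

```python
import math

def feature_relevance(query, examples, labels, omega):
    """Fraction of accepted examples among those near the query on each feature."""
    rel = []
    for i, q in enumerate(query):
        near = [y for x, y in zip(examples, labels) if abs(x[i] - q) <= omega]
        rel.append(sum(near) / len(near) if near else 0.0)
    return rel

def relevance_weights(rel, sharpness=5.0):
    """Exponential weighting: more relevant features get larger weights."""
    exps = [math.exp(sharpness * r) for r in rel]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_knn(query, database, weights, k):
    """Indices of the k database items closest to the query under a weighted metric."""
    def dist(x):
        return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query)))
    return sorted(range(len(database)), key=lambda j: dist(database[j]))[:k]

query = (0.5, 0.5)
# Feedback from one retrieval round: 1 = accepted, 0 = rejected (toy data).
examples = [(0.52, 0.9), (0.48, 0.1), (0.9, 0.5), (0.1, 0.55), (0.55, 0.45)]
labels = [1, 1, 0, 0, 1]

rel = feature_relevance(query, examples, labels, omega=0.1)
weights = relevance_weights(rel)

database = [(0.5, 0.95), (0.7, 0.5), (0.52, 0.1)]
before = weighted_knn(query, database, [0.5, 0.5], k=2)
after = weighted_knn(query, database, weights, k=2)
print(before, after)
```

In this toy data only the first feature separates accepted from rejected examples, so the learned weights shrink the neighborhood along that feature and stretch it along the second, which changes which database items are retrieved.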

Key  Idea  2:  Learning  Visual  Concepts:  We  have  developed  an  approach  for  learning 
visual  concepts  in  images  based  on  statistical  learning  techniques  for  relevance  feedback 
and  fuzzy  clustering.  The  fuzzy  clustering  technique  successfully  handles  conditions 
where  the  concept  features  overlap.  In  this  work,  we  address  the  problem  of 
incorporating  prior  experience  of  the  retrieval  system  to  improve  the  performance  on 
future  queries.  We  develop  a  semi-supervised  fuzzy  clustering  method  to  learn  class 
distribution (meta knowledge) in the sense of high-level concepts from retrieval
experience.  Using  fuzzy  rules,  we  incorporate  the  meta  knowledge  into  a  probabilistic 
relevance  feedback  approach  to  improve  the  retrieval  performance.  Results  on  synthetic 
and  real  databases  show  that  our  approach  provides  better  retrieval  precision  compared  to 
the  case  when  no  retrieval  experience  is  used. 
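
To show why fuzzy clustering suits overlapping concepts, the sketch below computes standard fuzzy c-means memberships of a point with respect to fixed cluster centers. It illustrates the unsupervised membership formula only, not the semi-supervised method developed in this work; the centers, the fuzzifier `m` and the sample points are assumed values.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means memberships of one point for fixed cluster centers."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Two hypothetical concept centers in a 2-D feature space.
centers = [(0.0, 0.0), (4.0, 0.0)]
overlap = fuzzy_memberships((2.0, 0.0), centers)     # halfway between the concepts
near_first = fuzzy_memberships((1.0, 0.0), centers)  # closer to the first concept
print(overlap, near_first)
```

An image whose features lie between two concepts receives graded membership in both rather than a hard label, which is the behavior needed when concept features overlap.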